Julian Minder

23:35

2026-06-10

lesswrong.com

artificial-intelligence

Thoughts on Claude Fable's silent safeguards

Anthropic released Claude Fable 5, its most capable Mythos-class model, with new safeguards that silently limit the model's effectiveness for requests related to frontier LLM development without notif…
